exposure bias
Mitigating Exposure Bias in Risk-Aware Time Series Forecasting with Soft Tokens
Namazi, Alireza, Fathkouhi, Amirreza Dolatpour, Shakeri, Heman
Autoregressive forecasting is central to predictive control in diabetes and hemodynamic management, where different operating zones carry different clinical risks. Standard models trained with teacher forcing suffer from exposure bias, yielding unstable multi-step forecasts for closed-loop use. We introduce Soft-Token Trajectory Forecasting (SoTra), which propagates continuous probability distributions ("soft tokens") to mitigate exposure bias and learn calibrated, uncertainty-aware trajectories. A risk-aware decoding module then minimizes expected clinical harm. In glucose forecasting, SoTra reduces average zone-based risk by 18%; in blood-pressure forecasting, it lowers effective clinical risk by approximately 15%. These improvements support its use in safety-critical predictive control.
- North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
- North America > United States > New Jersey > Hudson County > Hoboken (0.04)
- Europe > United Kingdom > England (0.04)
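The idea of decoding by minimizing expected harm, mentioned in the abstract above, can be illustrated with a minimal sketch. This is not SoTra's actual decoding module: the function names, the bin values, and the asymmetric cost are all hypothetical and chosen only for illustration. Given a predictive distribution over discretized bins (a "soft token"), the decoder picks the candidate with the lowest expected cost instead of the most probable bin.

```python
def risk_aware_decode(probs, bin_values, cost):
    """Return (value, expected_cost) for the candidate with the lowest expected cost.

    probs      : predictive distribution over bins (a "soft token"), sums to 1
    bin_values : representative value for each bin
    cost       : cost(predicted, actual) -> float, a hypothetical harm function
    """
    best_value, best_risk = None, float("inf")
    for candidate in bin_values:
        # Expected cost of committing to this candidate under the distribution
        expected = sum(p * cost(candidate, actual)
                       for p, actual in zip(probs, bin_values))
        if expected < best_risk:
            best_value, best_risk = candidate, expected
    return best_value, best_risk


def asymmetric_cost(predicted, actual):
    # Illustrative assumption: overestimating (and so missing a low)
    # is penalized three times as heavily as underestimating.
    err = predicted - actual
    return 3.0 * err if err > 0 else -err


bins = [60.0, 100.0, 140.0]   # made-up bin centres in mg/dL
probs = [0.3, 0.4, 0.3]       # made-up predictive distribution

value, risk = risk_aware_decode(probs, bins, asymmetric_cost)
mode = bins[probs.index(max(probs))]
print(value, risk, mode)
```

With these made-up numbers the most probable bin is 100, but the asymmetric cost pulls the risk-aware choice down to 60, which shows how the decoded value can differ from the mode once harms are unequal.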
ReflexFlow: Rethinking Learning Objective for Exposure Bias Alleviation in Flow Matching
Huang, Guanbo, Mao, Jingjia, Huang, Fanding, Liu, Fengkai, Luo, Xiangyang, Liang, Yaoyuan, Lu, Jiasheng, Wang, Xiaoe, Liu, Pei, Fu, Ruiliu, Huang, Shao-Lun
Despite tremendous recent progress, Flow Matching methods still suffer from exposure bias due to discrepancies in training and inference. This paper investigates the root causes of exposure bias in Flow Matching, including: (1) the model lacks generalization to biased inputs during training, and (2) insufficient low-frequency content captured during early denoising, leading to accumulated bias. Based on these insights, we propose ReflexFlow, a simple and effective reflexive refinement of the Flow Matching learning objective that dynamically corrects exposure bias. ReflexFlow consists of two components: (1) Anti-Drift Rectification (ADR), which reflexively adjusts prediction targets for biased inputs utilizing a redesigned loss under training-time scheduled sampling; and (2) Frequency Compensation (FC), which reflects on missing low-frequency components and compensates them by reweighting the loss using exposure bias. ReflexFlow is model-agnostic, compatible with all Flow Matching frameworks, and improves generation quality across datasets. Experiments on CIFAR-10, CelebA-64, and ImageNet-256 show that ReflexFlow outperforms prior approaches in mitigating exposure bias, achieving a 35.65% reduction in FID on CelebA-64.
- Europe > United Kingdom > North Sea > Southern North Sea (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
Regularized Schrödinger Bridge: Alleviating Distortion and Exposure Bias in Solving Inverse Problems
Yao, Qing, Gao, Lijian, Mao, Qirong, Dong, Ming
Diffusion models serve as a powerful generative framework for solving inverse problems. However, they still face two key challenges: 1) the distortion-perception tradeoff, where improving perceptual quality often degrades reconstruction fidelity, and 2) the exposure bias problem, where the training-inference input mismatch leads to prediction error accumulation and reduced reconstruction quality. In this work, we propose the Regularized Schrödinger Bridge (RSB), an adaptation of Schrödinger Bridge tailored for inverse problems that addresses the above limitations. RSB employs a novel regularized training strategy that perturbs both the input states and targets, effectively mitigating exposure bias by exposing the model to simulated prediction errors and also alleviating distortion by well-designed interpolation via the posterior mean. Extensive experiments on two typical inverse problems for speech enhancement demonstrate that RSB outperforms state-of-the-art methods, significantly improving distortion metrics and effectively reducing exposure bias.
- Europe > France > Île-de-France > Paris > Paris (0.04)
- North America > Canada (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)
- Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Language Model Based Text-to-Audio Generation: Anti-Causally Aligned Collaborative Residual Transformers
Wang, Juncheng, Xu, Chao, Yu, Cheng, Hu, Zhe, Xie, Haoyu, Yu, Guoqi, Shang, Lei, Wang, Shujun
While language models (LMs) paired with residual vector quantization (RVQ) tokenizers have shown promise in text-to-audio (T2A) generation, they still lag behind diffusion-based models by a non-trivial margin. We identify a critical dilemma underpinning this gap: incorporating more RVQ layers improves audio reconstruction fidelity but exceeds the generation capacity of conventional LMs. To address this, we first analyze RVQ dynamics and uncover two key limitations: 1) orthogonality of features across RVQ layers hinders effective LM training, and 2) descending semantic richness in tokens from deeper RVQ layers exacerbates exposure bias during autoregressive decoding. Based on these insights, we propose Siren, a novel LM-based framework that employs multiple isolated transformers with causal conditioning and anti-causal alignment via reinforcement learning. Extensive experiments demonstrate that Siren outperforms both existing LM-based and diffusion-based T2A systems, achieving state-of-the-art results. By bridging the representational strengths of LMs with the fidelity demands of audio synthesis, our approach repositions LMs as competitive contenders against diffusion models in T2A tasks. Moreover, by aligning audio representations with linguistic structures, Siren facilitates a promising pathway toward unified multi-modal generation frameworks.
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia > China > Hong Kong (0.04)
NoiseShift: Resolution-Aware Noise Recalibration for Better Low-Resolution Image Generation
He, Ruozhen, Haji-Ali, Moayed, Yang, Ziyan, Ordonez, Vicente
Text-to-image diffusion models trained on a fixed set of resolutions often fail to generalize, even when asked to generate images at lower resolutions than those seen during training. High-resolution text-to-image generators are currently unable to easily offer an out-of-the-box budget-efficient alternative to their users who might not need high-resolution images. We identify a key technical insight in diffusion models that when addressed can help tackle this limitation: Noise schedulers have unequal perceptual effects across resolutions. The same level of noise removes disproportionately more signal from lower-resolution images than from high-resolution images, leading to a train-test mismatch. We propose NoiseShift, a training-free method that recalibrates the noise level of the denoiser conditioned on resolution size. NoiseShift requires no changes to model architecture or sampling schedule and is compatible with existing models. When applied to Stable Diffusion 3, Stable Diffusion 3.5, and Flux-Dev, quality at low resolutions is significantly improved. On LAION-COCO, NoiseShift improves SD3.5 by 15.89%, SD3 by 8.56%, and Flux-Dev by 2.44% in FID on average. On CelebA, NoiseShift improves SD3.5 by 10.36%, SD3 by 5.19%, and Flux-Dev by 3.02% in FID on average. These results demonstrate the effectiveness of NoiseShift in mitigating resolution-dependent artifacts and enhancing the quality of low-resolution image generation.
Counterfactual Risk Minimization with IPS-Weighted BPR and Self-Normalized Evaluation in Recommender Systems
Learning and evaluating recommender systems from logged implicit feedback is challenging due to exposure bias. While inverse propensity scoring (IPS) corrects this bias, it often suffers from high variance and instability. In this paper, we present a simple and effective pipeline that integrates IPS-weighted training with an IPS-weighted Bayesian Personalized Ranking (BPR) objective augmented by a Propensity Regularizer (PR). We compare Direct Method (DM), IPS, and Self-Normalized IPS (SNIPS) for offline policy evaluation, and demonstrate how IPS-weighted training improves model robustness under biased exposure. The proposed PR further mitigates variance amplification from extreme propensity weights, leading to more stable estimates. Experiments on synthetic and MovieLens 100K data show that our approach generalizes better under unbiased exposure while reducing evaluation variance compared to naive and standard IPS methods, offering practical guidance for counterfactual learning and evaluation in real-world recommendation settings.
- North America > United States (0.41)
- Europe > Czechia > Prague (0.06)
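To make the contrast between IPS and SNIPS in the abstract above concrete, here is a minimal sketch in a simplified setting where each logged interaction is weighted by the inverse of its exposure propensity. It is not the paper's pipeline: the function names and the logged data are made up for illustration, and the Propensity Regularizer is not modelled.

```python
def ips_estimate(rewards, propensities):
    """Inverse propensity scoring: mean of reward / propensity."""
    n = len(rewards)
    return sum(r / p for r, p in zip(rewards, propensities)) / n


def snips_estimate(rewards, propensities):
    """Self-normalized IPS: weighted reward sum divided by the sum of weights."""
    weights = [1.0 / p for p in propensities]
    return sum(w * r for w, r in zip(weights, rewards)) / sum(weights)


# Made-up logged implicit feedback: 1 = click, 0 = no click,
# with the (assumed known) probability that each item was exposed.
rewards = [1, 0, 1, 0]
propensities = [0.5, 0.5, 0.1, 0.25]

ips = ips_estimate(rewards, propensities)
snips = snips_estimate(rewards, propensities)
print(ips, snips)
```

In this toy log the single low-propensity click (0.1) pushes the IPS estimate to 3.0, well outside the [0, 1] range of the rewards, while the self-normalized estimate stays at 12/18, about 0.667. That is the variance problem with extreme weights that the abstract describes.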
On the Reliability of Sampling Strategies in Offline Recommender Evaluation
Pereira, Bruno L., Said, Alan, Santos, Rodrygo L. T.
Offline evaluation plays a central role in benchmarking recommender systems when online testing is impractical or risky. However, it is susceptible to two key sources of bias: exposure bias, where users only interact with items they are shown, and sampling bias, introduced when evaluation is performed on a subset of logged items rather than the full catalog. While prior work has proposed methods to mitigate sampling bias, these are typically assessed on fixed logged datasets rather than for their ability to support reliable model comparisons under varying exposure conditions or relative to true user preferences. In this paper, we investigate how different combinations of logging and sampling choices affect the reliability of offline evaluation. Using a fully observed dataset as ground truth, we systematically simulate diverse exposure biases and assess the reliability of common sampling strategies along four dimensions: sampling resolution (recommender model separability), fidelity (agreement with full evaluation), robustness (stability under exposure bias), and predictive power (alignment with ground truth). Our findings highlight when and how sampling distorts evaluation outcomes and offer practical guidance for selecting strategies that yield faithful and robust offline comparisons.
- North America > United States > New York > New York County > New York City (0.17)
- North America > United States > Georgia > Fulton County > Atlanta (0.14)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Counterfactual Reciprocal Recommender Systems for User-to-User Matching
Kawamura, Kazuki, Udagawa, Takuma, Tateno, Kei
Reciprocal recommender systems (RRS) in dating, gaming, and talent platforms require mutual acceptance for a match. Logged data, however, over-represents popular profiles due to past exposure policies, creating feedback loops that skew learning and fairness. We introduce Counterfactual Reciprocal Recommender Systems (CFRR), a causal framework to mitigate this bias. CFRR uses inverse propensity scored, self-normalized objectives. Experiments show CFRR improves NDCG@10 by up to 3.5% (e.g., from 0.459 to 0.475 on DBLP, from 0.299 to 0.307 on Synthetic), increases long-tail user coverage by up to 51% (from 0.504 to 0.763 on Synthetic), and reduces Gini exposure inequality by up to 24% (from 0.708 to 0.535 on Synthetic). CFRR offers a promising approach for more accurate and fair user-to-user matching.
- North America > Canada > Ontario > Toronto (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.94)
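The "up to" percentages quoted in the CFRR abstract above are relative changes, and they can be checked against the before/after values the abstract gives. The helper name below is hypothetical; the numbers are taken directly from the abstract.

```python
def relative_change(before, after):
    """Signed relative change from `before` to `after`, as a fraction."""
    return (after - before) / before


# Before/after pairs as reported in the abstract
ndcg_dblp = relative_change(0.459, 0.475)        # NDCG@10 on DBLP
ndcg_synth = relative_change(0.299, 0.307)       # NDCG@10 on Synthetic
coverage_synth = relative_change(0.504, 0.763)   # long-tail coverage on Synthetic
gini_synth = relative_change(0.708, 0.535)       # Gini exposure inequality on Synthetic

for name, value in [("NDCG@10 (DBLP)", ndcg_dblp),
                    ("NDCG@10 (Synthetic)", ndcg_synth),
                    ("Coverage (Synthetic)", coverage_synth),
                    ("Gini (Synthetic)", gini_synth)]:
    print(f"{name}: {100 * value:+.1f}%")
```

The DBLP NDCG@10 gain comes out near 3.5%, the coverage gain near 51%, and the Gini reduction near 24%, matching the headline figures. The Synthetic NDCG@10 gain is smaller (about 2.7%), which is consistent with the "up to" wording.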